Modelling Censored Losses Using Splicing: a Global Fit Strategy With Mixed Erlang and Extreme Value Distributions
In risk analysis, a global fit that appropriately captures the body and the
tail of the distribution of losses is essential. Modelling the whole range of
the losses using a standard distribution is usually very hard and often
impossible due to the specific characteristics of the body and the tail of the
loss distribution. A possible solution is to combine two distributions in a
splicing model: a light-tailed distribution for the body which covers light and
moderate losses, and a heavy-tailed distribution for the tail to capture large
losses. We propose a splicing model with a mixed Erlang (ME) distribution for
the body and a Pareto distribution for the tail. This combines the flexibility
of the ME distribution with the ability of the Pareto distribution to model
extreme values. We extend our splicing approach to censored and/or truncated
data. Relevant examples of such data can be found in financial risk analysis.
We illustrate the flexibility of this splicing model using practical examples
from risk measurement.
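The splicing construction can be sketched in a few lines. For illustration only, the body below is a single Erlang (integer-shape gamma) component rather than the full mixed Erlang family, and all parameter values are hypothetical: the body density is renormalised to carry weight w on (0, t], and the Pareto tail carries the remaining weight 1 - w above the splicing point t.

```python
import numpy as np
from scipy import stats

def spliced_pdf(x, shape, rate, t, alpha, w):
    """Density of a simple splicing model: an Erlang (gamma with integer
    shape) body below the splicing point t and a Pareto tail above it.
    The body is renormalised to integrate to w on (0, t], the tail
    integrates to 1 - w on (t, inf). Illustrative sketch only."""
    x = np.asarray(x, dtype=float)
    body = stats.gamma.pdf(x, a=shape, scale=1.0 / rate)
    body_mass = stats.gamma.cdf(t, a=shape, scale=1.0 / rate)
    tail = alpha * t**alpha / x**(alpha + 1)  # Pareto density on (t, inf)
    return np.where(x <= t, w * body / body_mass, (1 - w) * tail)
```

Because each piece is renormalised on its own region, the splicing weight w can be estimated jointly with the body and tail parameters.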
Sparse Regression with Multi-type Regularized Feature Modeling
Within the statistical and machine learning literature, regularization
techniques are often used to construct sparse (predictive) models. Most
regularization strategies treat all predictors identically, such as Lasso
regression, which treats (continuous) predictors as linear effects. However,
many predictive problems involve different types of
predictors and require a tailored regularization term. We propose a multi-type
Lasso penalty that acts on the objective function as a sum of subpenalties, one
for each type of predictor. As such, we allow for predictor selection and level
fusion within a predictor in a data-driven way, simultaneously with the parameter
estimation process. We develop a new estimation strategy for convex predictive
models with this multi-type penalty. Using the theory of proximal operators,
our estimation procedure is computationally efficient, partitioning the overall
optimization problem into easier-to-solve subproblems, specific to each
predictor type and its associated penalty. Earlier research applied
approximations to the non-differentiable penalties to solve the optimization
problem. The proposed SMuRF algorithm removes the need for such approximations
and achieves higher accuracy and computational efficiency. This is demonstrated
with an extensive simulation study and the analysis of a case study on
insurance pricing analytics.
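The separability behind this efficiency can be illustrated with a small sketch: when the multi-type penalty is a sum of subpenalties on disjoint coefficient blocks, its proximal operator splits into per-block proximal operators, each available in closed form. The operator names and penalty choices below are illustrative stand-ins, not the SMuRF implementation itself.

```python
import numpy as np

def prox_l1(beta, lam):
    """Proximal operator of lam * ||beta||_1: componentwise
    soft-thresholding, as used for continuous predictors with a
    plain Lasso subpenalty."""
    return np.sign(beta) * np.maximum(np.abs(beta) - lam, 0.0)

def prox_group(beta, lam):
    """Proximal operator of lam * ||beta||_2 (group Lasso): shrinks a
    predictor's whole coefficient block towards zero, selecting or
    dropping the predictor as a unit."""
    norm = np.linalg.norm(beta)
    if norm <= lam:
        return np.zeros_like(beta)
    return (1.0 - lam / norm) * beta

def multi_type_prox(blocks, penalties, lam):
    """Apply the type-specific proximal operator to each predictor block:
    the prox of a sum of subpenalties on disjoint blocks separates into
    these per-block subproblems."""
    ops = {"lasso": prox_l1, "grouplasso": prox_group}
    return [ops[p](b, lam) for b, p in zip(blocks, penalties)]
```

A proximal gradient method then alternates a gradient step on the smooth loss with one such prox evaluation per predictor block.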
Estimating the maximum possible earthquake magnitude using extreme value methodology: the Groningen case
The area-characteristic, maximum possible earthquake magnitude is
required by the earthquake engineering community, disaster management agencies
and the insurance industry. The Gutenberg-Richter law predicts that earthquake
magnitudes follow a truncated exponential distribution. In the geophysical
literature several estimation procedures have been proposed, see for instance
Kijko and Singh (Acta Geophys., 2011) and the references therein. Estimation of
this maximum magnitude is of course an extreme value problem to which the classical methods for
endpoint estimation could be applied. We argue that recent methods on truncated
tails at high levels (Beirlant et al., Extremes, 2016; Electron. J. Stat.,
2017) constitute a more appropriate setting for this estimation problem. We
present upper confidence bounds to quantify uncertainty of the point estimates.
We also compare methods from the extreme value and geophysical literature
through simulations. Finally, the different methods are applied to the
magnitude data for the earthquakes induced by gas extraction in the Groningen
province of the Netherlands.
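To make the setting concrete, a minimal sketch of the Gutenberg-Richter model: magnitudes follow an exponential distribution truncated to [m0, m_max], and since the truncated likelihood is decreasing in m_max, its naive maximum likelihood estimate is the sample maximum, with the rate beta fitted given the truncation points. Function names and values below are illustrative; the estimators compared in the paper are more refined.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def rtruncexp(n, beta, m0, m_max, rng):
    """Draw magnitudes from the Gutenberg-Richter (truncated exponential)
    distribution on [m0, m_max] by inverting its CDF."""
    u = rng.uniform(size=n)
    c = 1.0 - np.exp(-beta * (m_max - m0))
    return m0 - np.log(1.0 - c * u) / beta

def fit_beta(m, m0, m_max):
    """Maximum likelihood estimate of beta given both truncation points;
    the MLE of m_max itself is the sample maximum because the likelihood
    is decreasing in m_max."""
    def nll(beta):
        c = 1.0 - np.exp(-beta * (m_max - m0))
        return -np.sum(np.log(beta) - beta * (m - m0) - np.log(c))
    return minimize_scalar(nll, bounds=(1e-6, 50.0), method="bounded").x
```

Plugging the sample maximum in for m_max is exactly why endpoint estimation is hard: the naive estimate never exceeds the largest observed magnitude, motivating the extreme value corrections discussed above.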
Extreme Value Theory in Finance and Insurance
When modelling high-dimensional data, dimension reduction techniques such as principal component analysis are often used. In the first part of this thesis we will focus on two drawbacks of classical PCA. First, interpretation of classical PCA is often challenging because most of the loadings are neither very small nor very large in absolute value. Second, classical PCA can be heavily distorted by outliers since it is based on the classical covariance matrix. In order to resolve both problems, we present a new PCA algorithm that is robust against outliers and yields sparse PCs, i.e. PCs with many zero loadings. The approach is based on the ROBPCA algorithm that generates robust but non-sparse loadings. The construction of the new ROSPCA method is detailed, as well as a selection criterion for the sparsity parameter. An extensive simulation study and a real data example are performed, showing that it is capable of accurately finding the sparse structure of datasets, even when challenging outliers are present.
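The idea of sparse loadings can be sketched with a thresholded power iteration for the first component. This toy routine mimics only the sparsity mechanism and includes none of ROSPCA's robustness steps; all names and parameter values are illustrative.

```python
import numpy as np

def sparse_pc1(X, lam, n_iter=200):
    """First sparse principal component via thresholded power iteration:
    alternate a power-method step on the sample covariance matrix with
    soft-thresholding of the loading vector, so small loadings become
    exactly zero. Illustrative sketch only, not the ROSPCA algorithm."""
    S = np.cov(X, rowvar=False)
    v = np.ones(S.shape[0]) / np.sqrt(S.shape[0])
    for _ in range(n_iter):
        v = S @ v
        thr = lam * np.max(np.abs(v))  # threshold relative to largest entry
        v = np.sign(v) * np.maximum(np.abs(v) - thr, 0.0)
        v /= np.linalg.norm(v)
    return v

# data with a known sparse structure: only variables 0-2 load on the PC
rng = np.random.default_rng(1)
scores = rng.normal(size=(200, 1))
X = rng.normal(scale=0.1, size=(200, 6))
X[:, :3] += scores * np.array([1.0, 0.8, 0.6])
loadings = sparse_pc1(X, lam=0.2)
```

The recovered loading vector has exact zeros on the pure-noise variables, which is what makes sparse PCs easy to interpret.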
Stock market crashes such as Black Monday in 1987 and catastrophes such as earthquakes are examples of extreme events in finance and insurance, respectively. They are large events with a considerable impact that occur seldom. Extreme value theory (EVT) provides a theoretical framework to model extreme values such that e.g. risk measures can be estimated based on available data. In the second part of this PhD thesis we focus on applications of EVT that are of interest to finance and insurance.
A Black Swan is an improbable event with massive consequences. We propose a way to investigate if the 2007-2008 financial crisis was a Black Swan event for a given bank based on weekly log-returns. This is done by comparing the tail behaviour of the negative log-returns before and after the crisis using techniques from extreme value methodology. We illustrate this approach with Barclays and Credit Suisse data, and then link the differences in tail risk behaviour between these banks with economic indicators.
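A standard tool for such a tail comparison is the Hill estimator of the extreme value index. The sketch below applies it to simulated heavy-tailed samples rather than actual bank returns; sample sizes, tail indices and the choice of k are all illustrative.

```python
import numpy as np

def hill(losses, k):
    """Hill estimator of the extreme value index based on the k largest
    observations; a larger estimate indicates a heavier tail."""
    x = np.sort(np.asarray(losses, dtype=float))[::-1]
    return np.mean(np.log(x[:k])) - np.log(x[k])

# illustrative stand-ins for pre- and post-crisis negative log-returns
rng = np.random.default_rng(2)
pre = rng.pareto(4.0, size=1000) + 1.0   # Pareto tail, index 1/4
post = rng.pareto(2.0, size=1000) + 1.0  # heavier tail, index 1/2
```

Comparing the estimates over a range of k before and after the crisis indicates whether the tail of the loss distribution became heavier, i.e. whether the crisis plausibly sits outside the pre-crisis model.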
The earthquake engineering community, disaster management agencies and the insurance industry need models for earthquake magnitudes to predict possible damage by earthquakes. A crucial element in these models is the area-characteristic, maximum possible earthquake magnitude. The Gutenberg-Richter distribution, which is a (doubly) truncated exponential distribution, is widely used to model earthquake magnitudes. Recently, Aban et al. (2006) and Beirlant et al. (2016) discussed tail fitting for truncated Pareto-type distributions. However, as is the case for the Gutenberg-Richter distribution, in some applications the underlying distribution appears to have a lighter tail than the Pareto distribution. We generalise the classical peaks over threshold (POT) approach to allow for truncation effects. This enables a unified treatment of extreme value analysis for truncated heavy and light tails. We use a pseudo maximum likelihood approach to estimate the model parameters and consider extreme quantile estimation. The new approach is illustrated on examples from hydrology and geophysics. Moreover, we perform simulations to illustrate the potential of the method on truncated heavy and light tails.
The new approach can then be used to estimate the maximum possible earthquake magnitude. We also look at two other EVT-based endpoint estimators and endpoint estimators that are used in the geophysical literature. To quantify uncertainty of the point estimates for the endpoint, upper confidence bounds are also considered. We apply the techniques to provide estimates, and upper confidence bounds, for the maximum possible earthquake magnitude in Groningen where earthquakes are induced by gas extraction. Furthermore, we compare the methods from extreme value theory and the geophysical literature through simulations.
In risk analysis, a global fit that appropriately captures the body and the tail of the distribution of losses is essential. Modelling the whole range of the losses using a standard distribution is usually very hard and often impossible due to the specific characteristics of the body and the tail of the loss distribution. A possible solution is to combine two distributions in a splicing model: a light-tailed distribution for the body which covers light and moderate losses, and a heavy-tailed distribution for the tail to capture large losses. We propose a splicing model with the flexible mixed Erlang distribution for the body and a Pareto distribution for the tail. Motivated by examples in financial risk analysis, we extend our splicing approach to censored and/or truncated data. We illustrate the flexibility of this splicing model using practical examples from reinsurance.
Fitting tails affected by truncation
In several applications, ultimately at the largest data, truncation effects can be observed when analysing tail characteristics of statistical distributions. In some cases truncation effects are forecasted through physical models such as the Gutenberg-Richter relation in geophysics, while in other instances the nature of the measurement process itself may cause under-recovery of large values, for instance due to flooding in river discharge readings. Recently, Beirlant, Fraga Alves and Gomes (2016) discussed tail fitting for truncated Pareto-type distributions. Using examples from earthquake analysis, hydrology and diamond valuation we demonstrate the need for a unified treatment of extreme value analysis for truncated heavy and light tails. We generalise the classical Peaks over Threshold approach for the different max-domains of attraction with shape parameter ξ > −1/2 to allow for truncation effects. We use a pseudo maximum likelihood approach to estimate the model parameters and consider extreme quantile estimation and reconstruction of quantile levels before truncation whenever appropriate. We report on some simulation experiments and provide some basic asymptotic results.
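The truncated POT idea can be sketched as a pseudo-maximum-likelihood fit in which the generalized Pareto density of the exceedances is renormalised at the truncation point. The function names and starting values below are illustrative; the paper's treatment of the max-domains of attraction with ξ > −1/2 is more careful.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

def fit_truncated_gpd(exceedances, T):
    """Pseudo-ML fit of a generalized Pareto distribution to threshold
    exceedances truncated at T: the GPD log-density is renormalised by
    the log-CDF at the truncation point. Illustrative sketch only."""
    y = np.asarray(exceedances, dtype=float)

    def nll(theta):
        xi, log_sigma = theta
        sigma = np.exp(log_sigma)
        logpdf = stats.genpareto.logpdf(y, c=xi, scale=sigma)
        logcdf = stats.genpareto.logcdf(T, c=xi, scale=sigma)
        return -np.sum(logpdf - logcdf)

    res = minimize(nll, x0=np.array([0.1, 0.0]), method="Nelder-Mead")
    xi_hat, log_sigma_hat = res.x
    return xi_hat, np.exp(log_sigma_hat)
```

Without the renormalisation term the fit would mistake truncation for a lighter tail; with it, the same routine covers truncated heavy and light tails.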
Sparse PCA for high-dimensional data with outliers
A new sparse PCA algorithm is presented, which is robust against outliers. The approach is based on the ROBPCA algorithm that generates robust but nonsparse loadings. The construction of the new ROSPCA method is detailed, as well as a selection criterion for the sparsity parameter. An extensive simulation study and a real data example are performed, showing that it is capable of accurately finding the sparse structure of datasets, even when challenging outliers are present. In comparison with a projection pursuit-based algorithm, ROSPCA demonstrates superior robustness properties and comparable sparsity estimation capability, as well as significantly faster computation time.